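GPT-GNN: Generative Pre-Training of Graph Neural Networks | Network Science Paper Digest: 35 Papers
This digest is machine-translated and for reference only; if a paper interests you, please consult the original.
GPT-GNN: Generative Pre-Training of Graph Neural Networks;
A Closed-Loop Framework for Inference, Prediction and Control of SIR Epidemics on Networks;
The non-universality of wealth distribution tails near wealth condensation criticality;
Prediction of the Number of COVID-19 Confirmed Cases Based on K-Means-LSTM;
What the reproductive number R_0 can and cannot tell us about COVID-19 dynamics;
Resilience in urban networked infrastructure: the case of Water Distribution Systems;
Exploratory Study of Young Children's Social Media Needs and Requirements;
Effect of Diverse Temporal Recoding of Granule Cells on Pavlovian Eyeblink Conditioning in The Cerebellum;
Transitions in overstable rotating magnetoconvection;
Entanglement-Embedded Recurrent Network Architecture: Tensorized Latent State Propagation and Chaos Forecasting;
Two dimensional soliton in tumor induced angiogenesis;
Investigation of Soft and Living Matter using a Micro-Extensional Rheometer;
Optical tweezers approaches for probing multiscale protein mechanics and assembly;
Light-emitting diode excitation for upconversion microscopy: a quantitative assessment;
Distributed consent and its impact on privacy and observability in social networks;
Joint Cyber Risk Assessment of Network Systems with Heterogeneous Components;
Passive Wi-Fi Monitoring in Public Transport: A case study in the Madeira Island;
CanaryTrap: Detecting Data Misuse by Third-Party Apps on Online Social Networks;
Central limit theorems for local network statistics;
Predicting Customer Churn in World of Warcraft;
Selling sex: what determines rates and popularity? An analysis of 11.5 thousand online profiles;
Computational Courtship: Understanding the Evolution of Online Dating through Large-scale Data Analysis;
Community detection and percolation of information in a geometric setting;
Exact Inference with Latent Variables in an Arbitrary Domain;
Mining persistent activity in continually evolving networks;
Impassable: avoiding spurious shortest-path-based paths in multidimensional complex networks;
Optimal bounds on the adaptivity gap of influence maximization under full-adoption feedback;
Community structure in node-aware network embeddings;
Learning through the grapevine: the impact of noise and the breadth and depth of social networks;
Decentralized competing bandits under uniform valuation: dominate or delete;
Stay in touch with your community: bridges between clusters trigger the expansion of COVID-19;
A privacy-preserving test optimization algorithm for epidemic control;
Impact of population migration and intermittent lockdowns on the spread of SARS-CoV-2;
Pileup corrections for higher-order cumulants;
Spectrum of the global correlation matrix of sea surface temperature: from random matrix theory to Poisson fluctuations;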
GPT-GNN: Generative Pre-Training of Graph Neural Networks
Original title:
GPT-GNN: Generative Pre-Training of Graph Neural Networks
URL:
http://arxiv.org/abs/2006.15437
Authors:
Ziniu Hu, Yuxiao Dong, Kuansan Wang, Kai-Wei Chang, Yizhou Sun
Abstract:Graph neural networks (GNNs) have been demonstrated to be powerful in modeling graph-structured data. However, training GNNs usually requires abundant task-specific labeled data, which is often arduously expensive to obtain. One effective way to reduce the labeling effort is to pre-train an expressive GNN model on unlabeled data with self-supervision and then transfer the learned model to downstream tasks with only a few labels. In this paper, we present the GPT-GNN framework to initialize GNNs by generative pre-training. GPT-GNN introduces a self-supervised attributed graph generation task to pre-train a GNN so that it can capture the structural and semantic properties of the graph. We factorize the likelihood of the graph generation into two components: 1) Attribute Generation and 2) Edge Generation. By modeling both components, GPT-GNN captures the inherent dependency between node attributes and graph structure during the generative process. Comprehensive experiments on the billion-scale Open Academic Graph and Amazon recommendation data demonstrate that GPT-GNN significantly outperforms state-of-the-art GNN models without pre-training by up to 9.1% across various downstream tasks.
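Summary: Graph neural networks (GNNs) have proven powerful for modeling graph-structured data, but training them usually requires large amounts of task-specific labeled data, which is often very expensive to obtain. One effective way to reduce the labeling effort is to pre-train an expressive GNN on unlabeled data with self-supervision and then transfer the learned model to downstream tasks that have only a few labels. The paper presents GPT-GNN, a framework that initializes GNNs by generative pre-training. GPT-GNN introduces a self-supervised attributed graph generation task so that the pre-trained GNN captures both the structural and the semantic properties of the graph. The likelihood of graph generation is factorized into two components, attribute generation and edge generation; modeling both lets GPT-GNN capture the inherent dependency between node attributes and graph structure during generation. Experiments on the billion-scale Open Academic Graph and on Amazon recommendation data show that GPT-GNN outperforms state-of-the-art GNN models without pre-training by up to 9.1% across various downstream tasks.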
A Closed-Loop Framework for Inference, Prediction and Control of SIR Epidemics on Networks
Original title:
A Closed-Loop Framework for Inference, Prediction and Control of SIR Epidemics on Networks
URL:
http://arxiv.org/abs/2006.16185
Authors:
Ashish R. Hota, Jaydeep Godbole, Pradhuman Bhariya, Philip E Paré
Abstract:Motivated by the ongoing pandemic COVID-19, we propose a closed-loop framework that combines inference from testing data, learning the parameters of the dynamics and optimal resource allocation for controlling the spread of the susceptible-infected-recovered (SIR) epidemic on networks. Our framework incorporates several key factors present in testing data, such as high risk individuals are more likely to undergo testing and infected individuals potentially act as asymptomatic carriers of the disease. We then present two tractable optimization problems to evaluate the trade-off between controlling the growth-rate of the epidemic and the cost of non-pharmaceutical interventions (NPIs). Our results provide compelling insights for policy-makers, including the significance of early testing and the emergence of a second wave of infections if NPIs are prematurely withdrawn.
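Summary: Motivated by the ongoing COVID-19 pandemic, the authors propose a closed-loop framework that combines inference from testing data, learning of the dynamics parameters, and optimal resource allocation to control the spread of a susceptible-infected-recovered (SIR) epidemic on networks. The framework accounts for several key features of testing data, for example that high-risk individuals are more likely to be tested and that infected individuals may be asymptomatic carriers. Two tractable optimization problems are then formulated to evaluate the trade-off between controlling the epidemic growth rate and the cost of non-pharmaceutical interventions (NPIs). The results offer insights for policy-makers, including the importance of early testing and the emergence of a second wave of infections if NPIs are withdrawn prematurely.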
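The SIR dynamics underlying this framework can be illustrated with a small simulation. The sketch below is a deterministic, discrete-time mean-field approximation of SIR on a weighted network. The function name simulate_sir, the parameters beta, gamma and dt, and the update rule itself are assumptions made for illustration; they are not taken from the paper, whose inference and resource-allocation steps are omitted here.

```python
def simulate_sir(adjacency, beta, gamma, infected0, steps, dt=0.1):
    """Deterministic discrete-time SIR on a network (illustrative mean-field sketch).

    adjacency[i][j] is the contact weight from node j to node i.
    Returns a list of (s, x, r) snapshots, each a tuple of per-node lists.
    Assumes dt * gamma <= 1 so that infected fractions stay non-negative.
    """
    n = len(adjacency)
    x = list(infected0)          # infected fraction per node
    r = [0.0] * n                # recovered fraction per node
    s = [1.0 - xi for xi in x]   # susceptible fraction per node
    history = [(s[:], x[:], r[:])]
    for _ in range(steps):
        new_s, new_x, new_r = [], [], []
        for i in range(n):
            # Infection pressure on node i from its infected neighbors.
            pressure = sum(adjacency[i][j] * x[j] for j in range(n))
            # Cap new infections at the susceptible fraction that is left.
            infections = min(s[i], dt * beta * s[i] * pressure)
            recoveries = dt * gamma * x[i]
            new_s.append(s[i] - infections)
            new_x.append(x[i] + infections - recoveries)
            new_r.append(r[i] + recoveries)
        s, x, r = new_s, new_x, new_r
        history.append((s[:], x[:], r[:]))
    return history


# Hypothetical 3-node path network with a small outbreak at node 0.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
trajectory = simulate_sir(path, beta=0.5, gamma=0.1,
                          infected0=[0.1, 0.0, 0.0], steps=200)
```

In a sketch like this, a non-pharmaceutical intervention could be modeled crudely by lowering beta or by scaling down entries of the adjacency matrix; how the paper parameterizes interventions is not stated in the abstract.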
The non-universality of wealth distribution tails near wealth condensation criticality
Original title:
The non-universality of wealth distribution tails near wealth condensation criticality
URL:
http://arxiv.org/abs/2006.15008
Authors:
Sam L. Polk, Bruce M. Boghosian
Abstract:In this work, we modify the Affine Wealth Model of wealth distributions to examine the effects of nonconstant redistribution on the very wealthy. Previous studies of this model, restricted to flat redistribution schemes, have demonstrated the presence of a phase transition to a partially wealth-condensed state, or "partial oligarchy", at the critical value of an order parameter. These studies have also indicated the presence of an exponential tail in wealth distribution precisely at criticality. Away from criticality, the tail was observed to be Gaussian. In this work, we generalize the flat redistribution within the Affine Wealth Model to allow for an essentially arbitrary redistribution policy. We show that the exponential tail observed near criticality in prior work is in fact a special case of a much broader class of critical, slower-than-Gaussian decays that depend sensitively on the corresponding asymptotic behavior of the progressive redistribution model used. We thereby demonstrate that the functional form of the tail of the wealth distribution of a near-critical society is not universal in nature, but rather is entirely determined by the specifics of public policy decisions. This is significant because most major economies today are observed to be near-critical.
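Summary: The authors modify the Affine Wealth Model of wealth distributions to examine the effect of nonconstant redistribution on the very wealthy. Earlier studies of the model, restricted to flat redistribution schemes, showed a phase transition to a partially wealth-condensed state, or "partial oligarchy", at the critical value of an order parameter. They also indicated an exponential tail in the wealth distribution precisely at criticality and a Gaussian tail away from it. This work generalizes the flat redistribution to an essentially arbitrary redistribution policy. It shows that the exponential tail observed near criticality is a special case of a much broader class of critical, slower-than-Gaussian decays that depend sensitively on the asymptotic behavior of the progressive redistribution model used. The functional form of the wealth distribution tail in a near-critical society is therefore not universal but is determined entirely by the specifics of public policy decisions. This matters because most major economies today are observed to be near-critical.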
Prediction of the Number of COVID-19 Confirmed Cases Based on K-Means-LSTM
Original title:
Prediction of the Number of COVID-19 Confirmed Cases Based on K-Means-LSTM
URL:
http://arxiv.org/abs/2006.14752
Authors:
Shashank Reddy Vadyala, Sai Nethra Betgeri, Eric A. Sherer, Amod Amritphale
Abstract:COVID-19 is a pandemic disease that began to rapidly spread in the US with the first case detected on January 19, 2020, in Washington State. March 9, 2020, and then increased rapidly with total cases of 25,739 as of April 20, 2020. The Covid-19 pandemic is so unnerving that it is difficult to understand how any person is affected by the virus. Although most people with coronavirus 81%, according to the U.S. Centers for Disease Control and Prevention (CDC), will have little to mild symptoms, others may rely on a ventilator to breathe or not at all. SEIR models have broad applicability in predicting the outcome of the population with a variety of diseases. However, many researchers use these models without validating the necessary hypotheses. Far too many researchers often "overfit" the data by using too many predictor variables and small sample sizes to create models. Models thus developed are unlikely to stand validity check on a separate group of population and regions. The researcher remains unaware that overfitting has occurred, without attempting such validation. In the paper, we present a combination algorithm that combines similar days features selection based on the region using Xgboost, K Means, and long short-term memory (LSTM) neural networks to construct a prediction model (i.e., K-Means-LSTM) for short-term COVID-19 cases forecasting in Louisana state USA. The weighted k-means algorithm based on extreme gradient boosting is used to evaluate the similarity between the forecasts and past days. The results show that the method with K-Means-LSTM has a higher accuracy with an RMSE of 601.20 whereas the SEIR model with an RMSE of 3615.83.
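Summary: COVID-19 is a pandemic disease that began to spread rapidly in the US after the first case was detected in Washington State on January 19, 2020. Cases then increased rapidly, with the abstract citing March 9, 2020 and a total of 25,739 cases as of April 20, 2020. According to the U.S. Centers for Disease Control and Prevention (CDC), most people with the coronavirus (81%) have few or only mild symptoms, while others may need a ventilator to breathe. SEIR models are widely applicable for predicting population outcomes for a variety of diseases, but many researchers use them without validating the necessary hypotheses, and many overfit the data by using too many predictor variables with small sample sizes. Models built this way are unlikely to hold up when validated on a separate population or region, and without attempting such validation the researcher remains unaware that overfitting has occurred. The paper presents a combined algorithm that uses region-based similar-day feature selection with XGBoost, K-Means, and long short-term memory (LSTM) neural networks to build a prediction model (K-Means-LSTM) for short-term COVID-19 case forecasting in Louisiana, USA. A weighted k-means algorithm based on extreme gradient boosting evaluates the similarity between the forecast days and past days. The results show that K-Means-LSTM is more accurate, with an RMSE of 601.20 compared with 3615.83 for the SEIR model.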
What the reproductive number R_0 can and cannot tell us about COVID-19 dynamics
URL:
http://arxiv.org/abs/2006.14676
Authors:
Clara L. Shaw, David A. Kennedy
Abstract:The reproductive number R_0 (and its value after initial disease emergence R) has long been used to predict the likelihood of pathogen invasion, to gauge the potential severity of an epidemic, and to set policy around interventions. However, often ignored complexities have generated confusion around use of the metric. This is particularly apparent with the emergent pandemic virus SARS-CoV-2, the causative agent of COVID-19. We address some of these misconceptions, namely, how R changes over time, varies over space, and relates to epidemic size by referencing the mathematical definition of R and examples from the current pandemic. We hope that a better appreciation of the uses, nuances, and limitations of R facilitates a better understanding of epidemic spread, epidemic severity, and the effects of interventions in the context of SARS-CoV-2.
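Summary: The reproductive number R_0 (and its value after initial disease emergence, R) has long been used to predict the likelihood of pathogen invasion, gauge the potential severity of an epidemic, and set policy around interventions. Complexities that are often ignored have nevertheless created confusion about how to use the metric, which is particularly apparent for the emergent pandemic virus SARS-CoV-2, the causative agent of COVID-19. Referring to the mathematical definition of R and to examples from the current pandemic, the authors address several misconceptions: how R changes over time, how it varies over space, and how it relates to epidemic size. They hope that a better appreciation of the uses, nuances, and limitations of R will improve understanding of epidemic spread, epidemic severity, and the effects of interventions in the context of SARS-CoV-2.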
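One textbook way to see how R_0 relates to epidemic size is the classical final-size relation z = 1 - exp(-R_0 z) for a homogeneous, well-mixed SIR model, where z is the fraction of the population ever infected. The sketch below solves it by fixed-point iteration. Using this particular relation is an assumption made for illustration; the abstract does not say which formulation the authors use, and the function name is hypothetical.

```python
import math


def final_epidemic_size(r0, tol=1e-12, max_iter=10_000):
    """Solve z = 1 - exp(-r0 * z) by fixed-point iteration, starting from z = 1.

    Returns the fraction of a well-mixed population that is eventually infected
    under the classical SIR final-size relation (an idealized assumption).
    """
    z = 1.0
    for _ in range(max_iter):
        z_next = 1.0 - math.exp(-r0 * z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z  # not fully converged (happens when r0 is very close to 1)


for r0 in (0.5, 1.5, 2.0, 3.0):
    print(f"R_0 = {r0}: final size = {final_epidemic_size(r0):.4f}")
```

The relation illustrates one of the nuances the summary mentions: epidemic size grows nonlinearly with R_0, and the idealized model ignores the changes of R over time and space that the authors discuss.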
Resilience in urban networked infrastructure: the case of Water Distribution Systems
Original title:
Resilience in urban networked infrastructure: the case of Water Distribution Systems
URL:
http://arxiv.org/abs/2006.14622
Authors:
Antonio Candelieri, Ilaria Giordani, Andrea Ponti, Francesco Archetti
Abstract:Resilience is meant as the capability of a networked infrastructure to provide its service even if some components fail: in this paper we focus on how resilience depends both on net-wide measures of connectivity and the role of a single component. This paper has two objectives: first to show how a set of global measures can be obtained using techniques from network theory, in particular how the spectral analysis of the adjacency and Laplacian matrices and a similarity measure based on Jensen-Shannon divergence allows us to obtain a characterization of global connectivity which is both mathematically sound and operational. Second, how a clustering method in the subspace spanned by the l smallest eigenvectors of the Laplacian matrix allows us to identify the edges of the network whose failure breaks down the network. Even if most of the analysis can be applied to a generic networked infrastructure, specific references will be made to Water Distribution Networks (WDN).
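Summary: Resilience is understood as the ability of a networked infrastructure to keep providing its service even when some components fail. The paper examines how resilience depends both on network-wide measures of connectivity and on the role of a single component. It has two objectives. The first is to show how a set of global measures can be obtained with techniques from network theory, in particular how spectral analysis of the adjacency and Laplacian matrices, together with a similarity measure based on the Jensen-Shannon divergence, yields a characterization of global connectivity that is both mathematically sound and operational. The second is to show how clustering in the subspace spanned by the l smallest eigenvectors of the Laplacian matrix identifies the edges whose failure breaks the network apart. Most of the analysis applies to a generic networked infrastructure, but specific reference is made to Water Distribution Networks (WDN).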
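The Jensen-Shannon divergence mentioned above can be sketched in a few lines. The example compares the degree distribution of a network before and after one edge fails. Applying the divergence to degree distributions is an assumption made for illustration, since the abstract does not say which distributions the authors compare; all function names are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def degree_distribution(adjacency, max_degree):
    """Fraction of nodes with each degree 0..max_degree (unweighted, undirected)."""
    counts = [0] * (max_degree + 1)
    for row in adjacency:
        counts[sum(row)] += 1
    n = len(adjacency)
    return [c / n for c in counts]


# Hypothetical 4-node ring, and the same ring after the edge 0-3 fails.
ring = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
broken = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
change = jensen_shannon_divergence(degree_distribution(ring, 2),
                                   degree_distribution(broken, 2))
```

With base-2 logarithms the divergence lies between 0 (identical distributions) and 1 (disjoint support), which makes it convenient as a bounded dissimilarity score.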
Exploratory Study of Young Children's Social Media Needs and Requirements
Original title:
Exploratory Study of Young Children's Social Media Needs and Requirements
URL:
http://arxiv.org/abs/2006.14654
Authors:
Di "Chelsea" Sun, Vaishnavi Melkote, Ahmed Sabbir Arif
Abstract:As social media are becoming increasingly popular among young children, it is important to explore this population's needs and requirements from these platforms. As a first step to this, we conducted an exploratory design workshop with children aged between ten and eleven years to find out about their social media needs and requirements. Through an analysis of the paper prototypes solicited from the workshop, here we discuss the social media features that are the most desired by this population.
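Summary: As social media become increasingly popular among young children, it is important to explore what this population needs and requires from these platforms. As a first step, the authors ran an exploratory design workshop with children aged ten to eleven to learn about their social media needs and requirements. Based on an analysis of the paper prototypes collected at the workshop, they discuss the social media features this population desires most.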
Effect of Diverse Temporal Recoding of Granule Cells on Pavlovian Eyeblink Conditioning in The Cerebellum
Original title:
Effect of Diverse Temporal Recoding of Granule Cells on Pavlovian Eyeblink Conditioning in The Cerebellum
URL:
http://arxiv.org/abs/2006.14933
Authors:
Sang-Yoon Kim, Woochang Lim
Abstract:We consider the Pavlovian eyeblink conditioning (EBC) in the cerebellum via repeated presentation of paired conditioned stimulus (tone) and unconditioned stimulus (airpuff), and investigate the effect of diverse temporal recoding of granule (GR) cells on the EBC by varying the connection probability
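Summary: The authors consider Pavlovian eyeblink conditioning (EBC) in the cerebellum via repeated presentation of a paired conditioned stimulus (tone) and unconditioned stimulus (airpuff), and investigate the effect of diverse temporal recoding of granule (GR) cells on the EBC by varying the connection probability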
Transitions in overstable rotating magnetoconvection
Original title:
Transitions in overstable rotating magnetoconvection
URL:
http://arxiv.org/abs/2006.14646
Authors:
Ankan Banerjee, Manojit Ghosh, Pinaki Pal
Abstract:The classical Rayleigh-Bénard convection (RBC) system is known to exhibit either subcritical or supercritical transition to convection in the presence or absence of rotation and/or magnetic field. However, the simultaneous exhibition of subcritical and supercritical branches of convection in plane layer RBC depending on the initial conditions, has not been reported so far. Here, we report the phenomenon of simultaneous occurrence of subcritical and supercritical branches of convection in overstable RBC of electrically conducting low Prandtl number fluids (liquid metals) in the presence of an external uniform horizontal magnetic field and rotation about the vertical axis. Extensive three dimensional (3D) direct numerical simulations (DNS) and low dimensional modeling of the system, performed in the ranges
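Summary: The classical Rayleigh-Bénard convection (RBC) system is known to exhibit either a subcritical or a supercritical transition to convection, in the presence or absence of rotation and/or a magnetic field. However, the simultaneous appearance of subcritical and supercritical branches of convection in plane-layer RBC, depending on the initial conditions, has not been reported so far. The paper reports the simultaneous occurrence of subcritical and supercritical branches of convection in overstable RBC of electrically conducting low-Prandtl-number fluids (liquid metals) under an external uniform horizontal magnetic field and rotation about the vertical axis. Extensive three-dimensional (3D) direct numerical simulations (DNS) and low-dimensional modeling of the system, performed in the ranges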
Entanglement-Embedded Recurrent Network Architecture: Tensorized Latent State Propagation and Chaos Forecasting
Original title:
Entanglement-Embedded Recurrent Network Architecture: Tensorized Latent State Propagation and Chaos Forecasting
URL:
http://arxiv.org/abs/2006.14698
Authors:
Xiangyi Meng, Tong Yang
Abstract:Chaotic time series forecasting has been far less understood despite its tremendous potential in theory and real-world applications. Traditional statistical/ML methods are inefficient to capture chaos in nonlinear dynamical systems, especially when the time difference
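Summary: Despite its tremendous potential in theory and real-world applications, chaotic time series forecasting is still far from well understood. Traditional statistical/ML methods capture chaos in nonlinear dynamical systems inefficiently, especially when the time difference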
Two dimensional soliton in tumor induced angiogenesis
Original title:
Two dimensional soliton in tumor induced angiogenesis
URL:
http://arxiv.org/abs/2006.16138
Authors:
L. L. Bonilla, M. Carretero, F. Terragni
Abstract:Ensemble averages of a stochastic model show that, after a formation stage, the tips of active blood vessels in an angiogenic network form a moving two dimensional stable diffusive soliton, which advances toward sources of growth factor. Here we use methods of multiple scales to find the diffusive soliton as a solution of a deterministic equation for the mean density of active endothelial cells tips. We characterize the diffusive soliton shape in a general geometry, and find that its vector velocity and the trajectory of its center of mass along curvilinear coordinates solve appropriate collective coordinate equations. The vessel tip density predicted by the soliton compares well with that obtained by ensemble averages of simulations of the stochastic model.
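Summary: Ensemble averages of a stochastic model show that, after a formation stage, the tips of active blood vessels in an angiogenic network form a moving, stable two-dimensional diffusive soliton that advances toward sources of growth factor. The authors use the method of multiple scales to find the diffusive soliton as a solution of a deterministic equation for the mean density of active endothelial cell tips. They characterize the soliton shape in a general geometry and find that its vector velocity and the trajectory of its center of mass along curvilinear coordinates solve appropriate collective coordinate equations. The vessel tip density predicted by the soliton agrees well with that obtained from ensemble averages of simulations of the stochastic model.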
Investigation of Soft and Living Matter using a Micro-Extensional Rheometer
Original title:
Investigation of Soft and Living Matter using a Micro-Extensional Rheometer
URL:
http://arxiv.org/abs/2006.15958
Authors:
Sushil Dubey, Sukh Veer, Seshagiri Rao R V, Chirag Kalelkar, Pramod A Pullarkat
Abstract:Rheological properties of a material often require to be probed under extensional deformation. Examples include fibrous materials such as spider-silk, high-molecular weight polymer melts, and the contractile response of living cells. Such materials have strong molecular-level anisotropies which are either inherent or are induced by an imposed extension. However, unlike shear rheology, which is well-established, techniques to perform extensional rheology are currently under development and setups are often custom-designed for the problem under study. In this article, we present a versatile device that can be used to conduct extensional deformation studies of samples at microscopic scales with simultaneous imaging. We discuss the operational features of this device and present a number of applications.
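Summary: The rheological properties of a material often need to be probed under extensional deformation. Examples include fibrous materials such as spider silk, high-molecular-weight polymer melts, and the contractile response of living cells. Such materials have strong molecular-level anisotropies that are either inherent or induced by an imposed extension. Unlike shear rheology, which is well established, techniques for extensional rheology are still under development, and setups are often custom-designed for the problem under study. The article presents a versatile device for extensional deformation studies of samples at microscopic scales with simultaneous imaging, discusses its operational features, and presents a number of applications.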
Optical tweezers approaches for probing multiscale protein mechanics and assembly
Original title:
Optical tweezers approaches for probing multiscale protein mechanics and assembly
URL:
http://arxiv.org/abs/2006.15841
Authors:
Kathrin Lehmann, Marjan Shayegan, Gerhard A. Blab, Nancy R. Forde
Abstract:Multi-step assembly of individual protein building blocks is key to the formation of essential higher-order structures inside and outside of cells. Optical tweezers is a technique well suited to investigate the mechanics and dynamics of these structures at a variety of size scales. In this mini-review, we highlight experiments that have used optical tweezers to investigate protein assembly and mechanics, with a focus on the extracellular matrix protein collagen. These examples demonstrate how optical tweezers can be used to study mechanics across length scales, ranging from the single-molecule level to fibrils to protein networks. We discuss challenges in experimental design and interpretation, opportunities for integration with other experimental modalities, and applications of optical tweezers to current questions in protein mechanics and assembly.
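Summary: Multi-step assembly of individual protein building blocks is key to forming essential higher-order structures inside and outside cells. Optical tweezers are well suited to investigating the mechanics and dynamics of these structures at a variety of size scales. This mini-review highlights experiments that have used optical tweezers to investigate protein assembly and mechanics, with a focus on the extracellular matrix protein collagen. The examples show how optical tweezers can probe mechanics across length scales, from single molecules to fibrils to protein networks. The authors discuss challenges in experimental design and interpretation, opportunities for integration with other experimental modalities, and applications of optical tweezers to current questions in protein mechanics and assembly.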
Light-emitting diode excitation for upconversion microscopy: a quantitative assessment
Original title:
Light-emitting diode excitation for upconversion microscopy: a quantitative assessment
URL:
https://arxiv.org/abs/2006.15783
Authors:
Yueying Cao, Xianlin Zheng, Simone De Camillis, Bingyang Shi, James A. Piper, Nicolle H. Packer, Yiqing Lu
Abstract:Lanthanide-based upconversion nanoparticles (UCNPs) generally require high power laser excitation. Here we report wide-field upconversion microscopy at single-nanoparticle sensitivity using incoherent excitation of a 970-nm light-emitting diode (LED). We show that due to its broad emission spectrum, LED excitation is about 3 times less effective for UCNPs and generates high background compared to laser illumination. To counter this, we use time-gated luminescence detection to eliminate the residual background from the LED source, so that individual UCNPs with high sensitizer (Yb3+) doping and inert shell protection become clearly identified under LED excitation at 1.18 W cm-2, as confirmed by correlated electron microscopy images. Hydrophilic UCNPs are obtained by polysaccharide coating via a facile ligand exchange protocol to demonstrate imaging of cellular uptake using LED excitation. These results suggest a viable approach to bypassing the limitations associated with high-power lasers when applying UCNPs and upconversion microscopy to life science research.
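Summary: Lanthanide-based upconversion nanoparticles (UCNPs) generally require high-power laser excitation. The authors report wide-field upconversion microscopy at single-nanoparticle sensitivity using incoherent excitation from a 970-nm light-emitting diode (LED). Because of its broad emission spectrum, LED excitation is about 3 times less effective for UCNPs than laser illumination and generates a high background. To counter this, time-gated luminescence detection is used to eliminate the residual background from the LED source, so that individual UCNPs with high sensitizer (Yb3+) doping and inert shell protection are clearly identified under LED excitation at 1.18 W cm-2, as confirmed by correlated electron microscopy images. Hydrophilic UCNPs, obtained by polysaccharide coating via a facile ligand exchange protocol, are used to demonstrate imaging of cellular uptake under LED excitation. The results suggest a viable way to bypass the limitations of high-power lasers when applying UCNPs and upconversion microscopy to life science research.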
Distributed consent and its impact on privacy and observability in social networks
Original title:
Distributed consent and its impact on privacy and observability in social networks
URL:
http://arxiv.org/abs/2006.16140
Authors:
Juniper Lovato, Antoine Allard, Randall Harp, Laurent Hébert-Dufresne
Abstract:Personal data is not discrete in socially-networked digital environments. A single user who consents to allow access to their own profile can thereby expose the personal data of their network connections to non-consented access. The traditional (informed individual) consent model is therefore not appropriate in online social networks where informed consent may not be possible for all users affected by data processing and where information is shared and distributed across many nodes. Here, we introduce a model of "distributed consent" where individuals and groups can coordinate by giving consent conditional on that of their network connections. We model the impact of distributed consent on the observability of social networks and find that relatively low adoption of even the simplest formulation of distributed consent would allow macroscopic subsets of online networks to preserve their connectivity and privacy. Distributed consent is of course not a silver bullet, since it does not follow data as it flows in and out of the system, but it is one of the most straightforward non-traditional models to implement and it better accommodates the fuzzy, distributed nature of online data.
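Summary: Personal data is not discrete in socially networked digital environments. A single user who consents to access to their own profile can thereby expose the personal data of their network connections to access those connections never consented to. The traditional (informed individual) consent model is therefore not appropriate for online social networks, where informed consent may not be possible for all users affected by data processing and where information is shared and distributed across many nodes. The authors introduce a model of "distributed consent" in which individuals and groups coordinate by making their consent conditional on that of their network connections. They model the impact of distributed consent on the observability of social networks and find that relatively low adoption of even its simplest formulation would allow macroscopic subsets of online networks to preserve their connectivity and privacy. Distributed consent is not a silver bullet, since it does not follow data as it flows in and out of the system, but it is one of the most straightforward non-traditional models to implement and it better accommodates the fuzzy, distributed nature of online data.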
Joint Cyber Risk Assessment of Network Systems with Heterogeneous Components
Original title:
Joint Cyber Risk Assessment of Network Systems with Heterogeneous Components
URL:
https://arxiv.org/abs/2006.16092
Authors:
Gaofeng Da, Maochao Xu, Jingshi Zhang, Peng Zhao
Abstract:Cyber risks are the most common risks encountered by a modern network system. However, it is significantly difficult to assess the joint cyber risk owing to the network topology, risk propagation, and heterogeneities of components. In this paper, we propose a novel backward elimination approach for computing the joint cyber risk encountered by different types of components in a network system; moreover, explicit formulas are also presented. Certain specific network topologies including complete, star, and complete bi-partite topologies are studied. The effects of propagation depth and compromise probabilities on the joint cyber risk are analyzed using stochastic comparisons. The variances and correlations of cyber risks are examined by a simulation experiment. It was discovered that both variances and correlations change rapidly when the propagation depth increases from its initial value. Further, numerical examples are also presented.
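Summary: Cyber risks are the most common risks faced by a modern network system. Assessing the joint cyber risk is difficult, however, because of the network topology, risk propagation, and heterogeneity of components. The paper proposes a novel backward elimination approach for computing the joint cyber risk faced by different types of components in a network system and gives explicit formulas. Specific network topologies are studied, including complete, star, and complete bipartite topologies. The effects of propagation depth and compromise probabilities on the joint cyber risk are analyzed using stochastic comparisons, and the variances and correlations of cyber risks are examined in a simulation experiment. Both variances and correlations change rapidly when the propagation depth increases from its initial value. Numerical examples are also presented.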
Passive Wi-Fi Monitoring in Public Transport: A case study in the Madeira Island
Original title:
Passive Wi-Fi Monitoring in Public Transport: A case study in the Madeira Island
URL:
http://arxiv.org/abs/2006.16083
Authors:
Miguel Ribeiro, Bernardo Galvão, Catia Prandi, Nuno Nunes
Abstract:Transportation has become of evermore importance in the last years, affecting people's satisfaction and significantly impacting their quality of life. In this paper we present a low-cost infrastructure to collect passive Wi-Fi probes with the aim of monitoring, optimizing and personalizing public transport, towards a more sustainable mobility. We developed an embedded system deployed in 19 public transportation vehicles using passive Wi-Fi data. This data is analyzed on a per-vehicle and per-stop basis and compared against ground truth data (ticketing), while also using a method of estimating passenger exits, detecting peak loads on vehicles, and origin destination habits. As such, we argue that this data enables route optimization and provides local authorities and tourism boards with a tool to monitor and optimize the management of routes and transportation, identify and prevent accessibility issues, with the aim of improving the services offered to citizens and tourists, towards a more sustainable mobility.
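Summary: Transportation has become ever more important in recent years, affecting people's satisfaction and significantly impacting their quality of life. The paper presents a low-cost infrastructure that collects passive Wi-Fi probes in order to monitor, optimize, and personalize public transport, toward more sustainable mobility. The authors developed an embedded system that uses passive Wi-Fi data and deployed it in 19 public transportation vehicles. The data is analyzed per vehicle and per stop and compared against ground truth (ticketing) data, together with a method for estimating passenger exits, detecting peak vehicle loads, and identifying origin-destination habits. The authors argue that this data enables route optimization and gives local authorities and tourism boards a tool to monitor and optimize the management of routes and transportation and to identify and prevent accessibility issues, with the aim of improving the services offered to citizens and tourists.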
CanaryTrap: Detecting Data Misuse by Third-Party Apps on Online Social Networks
Original title:
CanaryTrap: Detecting Data Misuse by Third-Party Apps on Online Social Networks
URL:
http://arxiv.org/abs/2006.15794
Authors:
Shehroze Farooqi, Maaz Musa, Zubair Shafiq, Fareed Zaffar
Abstract:Online social networks support a vibrant ecosystem of third-party apps that get access to personal information of a large number of users. Despite several recent high-profile incidents, methods to systematically detect data misuse by third-party apps on online social networks are lacking. We propose CanaryTrap to detect misuse of data shared with third-party apps. CanaryTrap associates a honeytoken to a user account and then monitors its unrecognized use via different channels after sharing it with the third-party app. We design and implement CanaryTrap to investigate misuse of data shared with third-party apps on Facebook. Specifically, we share the email address associated with a Facebook account as a honeytoken by installing a third-party app. We then monitor the received emails and use Facebook's ad transparency tool to detect any unrecognized use of the shared honeytoken. Our deployment of CanaryTrap to monitor 1,024 Facebook apps has uncovered multiple cases of misuse of data shared with third-party apps on Facebook including ransomware, spam, and targeted advertising.
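Summary: Online social networks support a vibrant ecosystem of third-party apps that gain access to the personal information of large numbers of users. Despite several recent high-profile incidents, methods for systematically detecting data misuse by third-party apps on online social networks are lacking. The authors propose CanaryTrap to detect misuse of data shared with third-party apps. CanaryTrap associates a honeytoken with a user account and, after sharing it with a third-party app, monitors its unrecognized use via different channels. They design and implement CanaryTrap to investigate misuse of data shared with third-party apps on Facebook. Specifically, they share the email address associated with a Facebook account as a honeytoken by installing a third-party app, then monitor the received emails and use Facebook's ad transparency tool to detect any unrecognized use of the shared honeytoken. A deployment of CanaryTrap monitoring 1,024 Facebook apps uncovered multiple cases of misuse of data shared with third-party apps on Facebook, including ransomware, spam, and targeted advertising.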
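The honeytoken idea described above can be sketched as a small bookkeeping structure: one unique email address per app, so that any email arriving at that address can be traced back to the app it was shared with. Everything in the sketch is hypothetical, including the class name, the address format, the example.org domain, and the rule for flagging unrecognized senders; it is not the authors' implementation and uses no Facebook API.

```python
import secrets


class HoneytokenRegistry:
    """Maps one unique email honeytoken to each third-party app under test."""

    def __init__(self, domain="example.org"):
        self.domain = domain
        self._app_by_address = {}

    def issue(self, app_id):
        """Create a fresh address that is shared with exactly one app."""
        address = f"canary-{secrets.token_hex(8)}@{self.domain}".lower()
        self._app_by_address[address] = app_id
        return address

    def attribute(self, recipient):
        """Return the app a received email can be traced to, or None."""
        return self._app_by_address.get(recipient.lower())


def is_unrecognized(sender_domain, recognized_domains):
    """Flag email whose sender is not one of the app's known domains."""
    return sender_domain.lower() not in {d.lower() for d in recognized_domains}


registry = HoneytokenRegistry()
token = registry.issue("hypothetical-quiz-app")
# An email sent to `token` from an unknown domain would point at that app.
leaked_by = registry.attribute(token)
suspicious = is_unrecognized("bulk-mailer.example", ["quiz-app.example"])
```

Because each address is handed to a single app, attribution needs no content analysis; deciding which sender domains count as recognized is the part that requires judgment.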
Central limit theorems for local network statistics
Original title:
Central limit theorems for local network statistics
URL:
http://arxiv.org/abs/2006.15738
Authors:
P-A. Maugis
Abstract:Subgraph counts - in particular the number of occurrences of small shapes such as triangles - characterize properties of random networks, and as a result have seen wide use as network summary statistics. However, subgraphs are typically counted globally, and existing approaches fail to describe vertex-specific characteristics. On the other hand, rooted subgraph counts - counts focusing on any given vertex's neighborhood - are fundamental descriptors of local network properties. We derive the asymptotic joint distribution of rooted subgraph counts in inhomogeneous random graphs, a model which generalizes many popular statistical network models. This result enables a shift in the statistical analysis of large graphs, from estimating network summaries, to estimating models linking local network structure and vertex-specific covariates. As an example, we consider a school friendship network and show that local friendship patterns are significant predictors of gender and race.
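Summary: Subgraph counts, in particular the number of occurrences of small shapes such as triangles, characterize properties of random networks and are therefore widely used as network summary statistics. However, subgraphs are typically counted globally, and existing approaches fail to describe vertex-specific characteristics. Rooted subgraph counts, which focus on the neighborhood of any given vertex, are by contrast fundamental descriptors of local network properties. The author derives the asymptotic joint distribution of rooted subgraph counts in inhomogeneous random graphs, a model that generalizes many popular statistical network models. The result enables a shift in the statistical analysis of large graphs from estimating network summaries to estimating models that link local network structure and vertex-specific covariates. As an example, a school friendship network is considered, and local friendship patterns are shown to be significant predictors of gender and race.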
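The difference between a global subgraph count and a rooted one is easy to see for triangles. The sketch below counts, for each vertex, the triangles that contain it, and recovers the global count by dividing the total by three. The function name and the adjacency-list input format are assumptions made for illustration; the paper's general rooted counts cover more shapes than triangles.

```python
def rooted_triangle_counts(adjacency_lists):
    """Return {vertex: number of triangles containing that vertex}.

    adjacency_lists maps each vertex to an iterable of its neighbors in an
    undirected simple graph; every neighbor must also appear as a key, and
    vertices must be mutually comparable (for example, all integers).
    """
    neighbors = {v: set(ns) for v, ns in adjacency_lists.items()}
    counts = {}
    for v, ns in neighbors.items():
        # Each triangle through v corresponds to one connected pair of its neighbors.
        counts[v] = sum(1 for a in ns for b in ns if a < b and b in neighbors[a])
    return counts


# Hypothetical friendship graph: a triangle 0-1-2 plus a pendant vertex 3.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
local = rooted_triangle_counts(graph)
global_triangles = sum(local.values()) // 3
```

The per-vertex counts are the kind of local descriptor that could be paired with vertex-specific covariates, whereas the single global count discards that information.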
Predicting Customer Churn in World of Warcraft
Original title:
Predicting Customer Churn in World of Warcraft
URL:
http://arxiv.org/abs/2006.15735
Authors:
Sulman Khan
Abstract:In this paper, we explore a dataset that focuses on one year from January 1, 2008, until December 31, 2008, as it highlights the release of a major content update in the game. Machine learning is used in two aspects of this paper: Survival Analysis and Binary Classification. Firstly, we explore the dataset using the Kaplan Meier estimator to predict the duration until a customer churns, and lastly predict whether a person will churn in six months using traditional machine learning algorithms such as Logistic Regression, Support Vector Machine, KNN Classifier, and Random Forests. From the survival analysis results, WoW customers have a relatively long duration until churn, which solidifies the addictiveness of the game. Lastly, the binary classification performed in the best performing algorithm having a 96% ROC AUC score in predicting whether a customer will churn in six months.
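Summary: The paper explores a dataset covering one year, from January 1, 2008 to December 31, 2008, because it highlights the release of a major content update in the game. Machine learning is used in two ways: survival analysis and binary classification. First, the Kaplan-Meier estimator is used to predict the duration until a customer churns; then traditional machine learning algorithms such as Logistic Regression, Support Vector Machine, KNN Classifier, and Random Forests are used to predict whether a person will churn within six months. The survival analysis shows that WoW customers have a relatively long duration until churn, which the author takes as confirming the addictiveness of the game. In the binary classification, the best-performing algorithm reaches a 96% ROC AUC score in predicting whether a customer will churn within six months.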
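The Kaplan-Meier estimator used in the survival analysis can be written out directly: at each observed churn time, the survival probability is multiplied by one minus the fraction of at-risk customers who churned at that time. The sketch below is a minimal pure-Python version with made-up data; the function name and the input format are assumptions, and it is not the author's code or dataset.

```python
def kaplan_meier(durations, churned):
    """Kaplan-Meier survival estimate.

    durations[i] is the observed time for customer i; churned[i] is True if
    churn was observed and False if the record is censored (still active
    when observation ended). Returns a list of (time, survival_probability)
    pairs, one per distinct time at which at least one churn was observed.
    """
    events = sorted(zip(durations, churned))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        churn_count = 0
        removed = 0
        # Group all records that share the same observed time.
        while i < len(events) and events[i][0] == t:
            churn_count += 1 if events[i][1] else 0
            removed += 1
            i += 1
        if churn_count:
            survival *= 1.0 - churn_count / at_risk
            curve.append((t, survival))
        # Churned and censored customers both leave the risk set.
        at_risk -= removed
    return curve


# Made-up example: months observed, and whether churn was seen.
months = [1, 2, 2, 3, 4]
observed = [True, True, False, True, False]
curve = kaplan_meier(months, observed)
```

Censored records lower the number at risk without lowering the survival estimate, which is what distinguishes this estimator from a naive churn fraction.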
Selling sex: what determines rates and popularity? An analysis of 11.5 thousand online profiles
Original title:
Selling sex: what determines rates and popularity? An analysis of 11.5 thousand online profiles
URL:
http://arxiv.org/abs/2006.15648
Authors:
Alicia Mergenthaler, Taha Yasseri
Abstract:Sex work, or the exchange of sexual services for money or goods, is ubiquitous across eras and cultures. However, the practice of selling sex is often hidden due to stigma and the varying legal status of sex work. Online platforms that sex workers use to advertise services have become an increasingly important tool in studying a market that is largely hidden. Although prior literature has primarily shed light on sex work from a public health or policy perspective (focusing largely on female sex workers), there are few studies that empirically research patterns of service provision in online sex work. Little research has been done on understanding pricing and popularity in the market for commercial sex work. This study investigates the determinants of pricing and popularity in the market for commercial sexual services online by using data from the largest UK network of online sexual services, a platform that is the "industry-standard" for sex workers. While the size of these influences vary across genders, nationality, age, and services provided are shown to be primary drivers of rates and popularity in sex work.
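Summary: Sex work, the exchange of sexual services for money or goods, is ubiquitous across eras and cultures. The practice is often hidden, however, because of stigma and the varying legal status of sex work. Online platforms that sex workers use to advertise services have become an increasingly important tool for studying this largely hidden market. Prior literature has mainly examined sex work from a public health or policy perspective (focusing largely on female sex workers); few studies empirically research patterns of service provision in online sex work, and little research addresses pricing and popularity in the market for commercial sex work. This study investigates the determinants of pricing and popularity in the online market for commercial sexual services using data from the largest UK network of online sexual services, a platform that is the "industry standard" for sex workers. Although the size of these influences varies across genders, nationality, age, and services provided are shown to be the primary drivers of rates and popularity in sex work.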
Computational Courtship: Understanding the Evolution of Online Dating through Large-scale Data Analysis
Original title:
Computational Courtship: Understanding the Evolution of Online Dating through Large-scale Data Analysis
URL:
http://arxiv.org/abs/1809.10032
Authors:
Rachel Dinh, Patrick Gildersleve, Chris Blex, Taha Yasseri
Abstract:Have we become more tolerant of dating people of different social backgrounds compared to ten years ago? Has the rise of online dating exacerbated or alleviated gender inequalities in modern courtship? Are the most attractive people on these platforms necessarily the most successful? In this work, we examine the mate preferences and communication patterns of male and female users of the online dating site eHarmony over the past decade to identify how attitudes and behaviors have changed over this time period. While other studies have investigated disparities in user behavior between male and female users, this study is unique in its longitudinal approach. Specifically, we analyze how men and women differ in their preferences for certain traits in potential partners and how those preferences have changed over time. The second line of inquiry investigates to what extent physical attractiveness determines the rate of messages a user receives, and how this relationship varies between men and women. Thirdly, we explore whether online dating practices between males and females have become more equal over time or if biases and inequalities have remained constant (or increased). Fourthly, we study the behavioural traits in sending and replying to messages based on one's own experience of receiving messages and being replied to. Finally, we found that similarity between profiles is not a predictor for success except for the number of children and smoking habits. This work could have broader implications for shifting gender norms and social attitudes, reflected in online courtship rituals. Apart from the data-based research, we connect the results to existing theories that concern the role of ICTs in societal change. As searching for love online becomes increasingly common across generations and geographies, these findings may shed light on how people can build relationships through the Internet.
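Summary:Have we become more tolerant of dating people from different social backgrounds than we were ten years ago? Has the rise of online dating worsened or eased gender inequalities in modern courtship? Are the most attractive people on these platforms necessarily the most successful? In this study we examine the mate preferences and communication patterns of male and female users of the online dating site eHarmony over the past decade to identify how attitudes and behaviors have changed during that period. Although other studies have investigated differences in behavior between male and female users, this study is unique in its longitudinal approach. Specifically, we analyze how men and women differ in their preferences for certain traits in potential partners and how those preferences have changed over time. The second line of inquiry investigates how far physical attractiveness determines the rate at which a user receives messages, and how this relationship differs between men and women. Third, we explore whether the online dating practices of men and women have become more equal over time or whether biases and inequalities have stayed constant (or increased). Fourth, we study behavior in sending and replying to messages as a function of a user's own experience of receiving messages and being replied to. Finally, we find that similarity between profiles does not predict success, except for number of children and smoking habits. This work may have broader implications for shifting gender norms and social attitudes as reflected in online courtship rituals. Beyond the data-based research, we connect the results to existing theories about the role of information and communication technologies (ICTs) in societal change. As searching for love online becomes increasingly common across generations and geographies, these findings may shed light on how people can build relationships through the Internet.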
Community detection and percolation of information in a geometric setting
Original title:
Community detection and percolation of information in a geometric setting
URL:
http://arxiv.org/abs/2006.15574
Authors:
Ronen Eldan,Dan Mikulincer,Hester Pieters
Abstract:We make the first steps towards generalizing the theory of stochastic block models, in the sparse regime, towards a model where the discrete community structure is replaced by an underlying geometry. We consider a geometric random graph over a homogeneous metric space where the probability of two vertices to be connected is an arbitrary function of the distance. We give sufficient conditions under which the locations can be recovered (up to an isomorphism of the space) in the sparse regime. Moreover, we define a geometric counterpart of the model of flow of information on trees, due to Mossel and Peres, in which one considers a branching random walk on a sphere and the goal is to recover the location of the root based on the locations of leaves. We give some sufficient conditions for percolation and for non-percolation of information in this model.
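Summary:We take the first steps toward generalizing the theory of stochastic block models, in the sparse regime, to a model in which the discrete community structure is replaced by an underlying geometry. We consider a geometric random graph over a homogeneous metric space, where the probability that two vertices are connected is an arbitrary function of their distance. We give sufficient conditions under which the locations can be recovered (up to an isomorphism of the space) in the sparse regime. In addition, we define a geometric counterpart of the model of information flow on trees due to Mossel and Peres, in which one considers a branching random walk on a sphere and the goal is to recover the location of the root from the locations of the leaves. We give sufficient conditions for percolation and for non-percolation of information in this model.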
Exact inference with latent variables in an arbitrary domain
Original title:
Exact Inference with Latent Variables in an Arbitrary Domain
URL:
http://arxiv.org/abs/1902.03099
Authors:
Chuyang Ke,Jean Honorio
Abstract:We analyze the necessary and sufficient conditions for exact inference of a latent model. In latent models, each entity is associated with a latent variable following some probability distribution. The challenging question we try to solve is: can we perform exact inference without observing the latent variables, even without knowing what the domain of the latent variables is? We show that exact inference can be achieved using a semidefinite programming (SDP) approach without knowing either the latent variables or their domain. Our analysis predicts the experimental correctness of SDP with high accuracy, showing the suitability of our focus on the Karush-Kuhn-Tucker (KKT) conditions and the spectrum of a properly defined matrix. As a byproduct of our analysis, we also provide concentration inequalities with dependence on latent variables, both for bounded moment generating functions as well as for the spectra of matrices. To the best of our knowledge, these results are novel and could be useful for many other problems.
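Summary:We analyze the necessary and sufficient conditions for exact inference in a latent model. In latent models, each entity is associated with a latent variable that follows some probability distribution. The challenging question we address is whether exact inference is possible without observing the latent variables, and even without knowing their domain. We show that exact inference can be achieved with a semidefinite programming (SDP) approach without knowing either the latent variables or their domain. Our analysis predicts the experimental correctness of SDP with high accuracy, which supports our focus on the Karush-Kuhn-Tucker (KKT) conditions and the spectrum of a properly defined matrix. As a byproduct, we also provide concentration inequalities that depend on latent variables, both for bounded moment generating functions and for the spectra of matrices. To the best of our knowledge these results are novel and may be useful for many other problems.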
Mining persistent activity in continually evolving networks
Original title:
Mining Persistent Activity in Continually Evolving Networks
URL:
http://arxiv.org/abs/2006.15410
Authors:
Caleb Belth,Xinyi Zheng,Danai Koutra
Abstract:Frequent pattern mining is a key area of study that gives insights into the structure and dynamics of evolving networks, such as social or road networks. However, not only does a network evolve, but often the way that it evolves, itself evolves. Thus, knowing, in addition to patterns' frequencies, for how long and how regularly they have occurred---i.e., their persistence---can add to our understanding of evolving networks. In this work, we propose the problem of mining activity that persists through time in continually evolving networks---i.e., activity that repeatedly and consistently occurs. We extend the notion of temporal motifs to capture activity among specific nodes, in what we call activity snippets, which are small sequences of edge-updates that reoccur. We propose axioms and properties that a measure of persistence should satisfy, and develop such a persistence measure. We also propose PENminer, an efficient framework for mining activity snippets' Persistence in Evolving Networks, and design both offline and streaming algorithms. We apply PENminer to numerous real, large-scale evolving networks and edge streams, and find activity that is surprisingly regular over a long period of time, but too infrequent to be discovered by aggregate count alone, and bursts of activity exposed by their lack of persistence. Our findings with PENminer include neighborhoods in NYC where taxi traffic persisted through Hurricane Sandy, the opening of new bike-stations, characteristics of social network users, and more. Moreover, we use PENminer towards identifying anomalies in multiple networks, outperforming baselines at identifying subtle anomalies by 9.8-48% in AUC.
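Summary:Frequent pattern mining is a key area of study that gives insight into the structure and dynamics of evolving networks such as social or road networks. However, a network not only evolves; often the way it evolves itself evolves. Knowing, in addition to the frequencies of patterns, for how long and how regularly they have occurred (that is, their persistence) can therefore add to our understanding of evolving networks. In this work we propose the problem of mining activity that persists through time in continually evolving networks, that is, activity that occurs repeatedly and consistently. We extend the notion of temporal motifs to capture activity among specific nodes in what we call activity snippets, which are small sequences of edge updates that reoccur. We propose axioms and properties that a measure of persistence should satisfy, and we develop such a measure. We also propose PENminer, an efficient framework for mining the persistence of activity snippets in evolving networks, and design both offline and streaming algorithms. Applying PENminer to numerous real, large-scale evolving networks and edge streams, we find activity that is surprisingly regular over a long period but too infrequent to be discovered by aggregate counts alone, as well as bursts of activity exposed by their lack of persistence. Our findings include neighborhoods in NYC where taxi traffic persisted through Hurricane Sandy, the opening of new bike stations, characteristics of social network users, and more. We also use PENminer to identify anomalies in multiple networks, outperforming baselines at identifying subtle anomalies by 9.8-48% in AUC.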
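You shall not pass: avoiding spurious paths in shortest-path based centralities in multidimensional complex networks
Original title:
You Shall not Pass: Avoiding Spurious Paths in Shortest-Path Based Centralities in Multidimensional Complex Networks
URL:
http://arxiv.org/abs/2006.15401
Authors: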
Klaus Wehmuth,Artur Ziviani,Leonardo Chinelate Costa,Ana Paula Couto da Silva,Alex Borges Vieira
Abstract:In complex network analysis, centralities based on shortest paths, such as betweenness and closeness, are widely used. More recently, many complex systems are being represented by time-varying, multilayer, and time-varying multilayer networks, i.e. multidimensional (or high order) networks. Nevertheless, it is well-known that the aggregation process may create spurious paths on the aggregated view of such multidimensional (high order) networks. Consequently, these spurious paths may then cause shortest-path based centrality metrics to produce incorrect results, thus undermining the network centrality analysis. In this context, we propose a method able to avoid taking into account spurious paths when computing centralities based on shortest paths in multidimensional (or high order) networks. Our method is based on MultiAspect Graphs (MAG) to represent the multidimensional networks and we show that well-known centrality algorithms can be straightforwardly adapted to the MAG environment. Moreover, we show that, by using this MAG representation, pitfalls usually associated with spurious paths resulting from aggregation in multidimensional networks can be avoided at the time of the aggregation process. As a result, shortest-path based centralities are assured to be computed correctly for multidimensional networks, without taking into account spurious paths that could otherwise lead to incorrect results. We also present a case study that shows the impact of spurious paths in the computing of shortest paths and consequently of shortest-path based centralities, thus illustrating the importance of this contribution.
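Summary:In complex network analysis, centralities based on shortest paths, such as betweenness and closeness, are widely used. More recently, many complex systems have been represented as time-varying, multilayer, and time-varying multilayer networks, i.e. multidimensional (or high-order) networks. It is well known, however, that aggregation can create spurious paths in the aggregated view of such networks. These spurious paths may cause shortest-path based centrality metrics to produce incorrect results and so undermine the centrality analysis. We propose a method that avoids taking spurious paths into account when computing shortest-path based centralities in multidimensional (or high-order) networks. The method uses MultiAspect Graphs (MAG) to represent the multidimensional networks, and we show that well-known centrality algorithms can be adapted straightforwardly to the MAG environment. We also show that, with this MAG representation, the pitfalls usually associated with spurious paths arising from aggregation can be avoided at aggregation time. As a result, shortest-path based centralities are computed correctly for multidimensional networks, without spurious paths that could otherwise lead to incorrect results. We also present a case study showing the impact of spurious paths on the computation of shortest paths, and consequently on shortest-path based centralities, which illustrates the importance of this contribution.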
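The spurious-path problem described in this abstract can be illustrated with a small sketch. This is not the authors' MultiAspect Graph (MAG) method; the toy edge list and the helper names `aggregated_reachable` and `time_respecting_reachable` are assumptions made for illustration. The sketch compares reachability in a time-aggregated graph with reachability along paths whose hops use strictly increasing time steps.

```python
from collections import defaultdict, deque

# Toy temporal edge list (hypothetical data): (source, target, time step).
# B -> C is active only at t=1, before A -> B becomes active at t=2.
TEMPORAL_EDGES = [("A", "B", 2), ("B", "C", 1)]


def aggregated_reachable(edges, start):
    """Nodes reachable from `start` when time stamps are ignored."""
    adjacency = defaultdict(set)
    for u, v, _ in edges:
        adjacency[u].add(v)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


def time_respecting_reachable(edges, start):
    """Nodes reachable from `start` when each hop uses a strictly later time step."""
    earliest = {start: float("-inf")}  # earliest arrival time per reached node
    # Edges are scanned in time order, so the first arrival at a node is the earliest.
    for u, v, t in sorted(edges, key=lambda e: e[2]):
        if u in earliest and earliest[u] < t and v not in earliest:
            earliest[v] = t
    return set(earliest) - {start}
```

For the toy data, `aggregated_reachable(TEMPORAL_EDGES, "A")` contains C while `time_respecting_reachable(TEMPORAL_EDGES, "A")` does not, so the path from A to C exists only in the aggregated view.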
Better bounds on the adaptivity gap of influence maximization under full-adoption feedback
Original title:
Better Bounds on the Adaptivity Gap of Influence Maximization under Full-adoption Feedback
URL:
http://arxiv.org/abs/2006.15374
Authors:
Gianlorenzo D'Angelo,Debashmita Poddar,Cosimo Vinci
Abstract:In the influence maximization (IM) problem, we are given a social network and a budget
Our main result is the first sub-linear upper bound that holds for any graph. Specifically, we show that the adaptivity gap is upper-bounded by
Summary:In the influence maximization (IM) problem, we are given a social network and a budget
Community structure aware embedding of nodes in a network
Original title:
Community Structure aware Embedding of Nodes in a Network
URL:
http://arxiv.org/abs/2006.15313
Authors:
Swarup Chattopadhyay,Debasis Ganguly
Abstract:Detecting communities or the modular structure of real-life networks (e.g. a social network or a product purchase network) is an important task because the way a network functions is often determined by its communities. Traditional approaches to community detection involve modularity-based algorithms, which generally speaking, construct partitions based on heuristics that seek to maximize the ratio of the edges within the partitions to those between them. On the other hand, node embedding approaches represent each node in a graph as a real-valued vector and is thereby able to transform the problem of community detection in a graph to that of clustering a set of vectors. Existing node embedding approaches are primarily based on, first, initiating random walks from each node to construct a context of a node, and then make the vector representation of a node close to its context. However, standard node embedding approaches do not directly take into account the community structure of a network while constructing the context around each node. To alleviate this, we explore two different threads of work. First, we investigate the use of maximum entropy-based random walks to obtain more centrality preserving embedding of nodes, which may lead to more effective clusters in the embedded space. Second, we propose a community structure-aware node embedding approach, where we incorporate modularity-based partitioning heuristics into the objective function of node embedding. We demonstrate that our proposed combination of the combinatorial and the embedding approaches for community detection outperforms a number of modularity-based baselines and K-means clustering on a standard node-embedded (node2vec) vector space on a wide range of real-life and synthetic networks of different sizes and densities.
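Summary:Detecting communities, or the modular structure, of real-life networks (e.g. a social network or a product purchase network) is an important task because the way a network functions is often determined by its communities. Traditional approaches to community detection use modularity-based algorithms, which generally construct partitions from heuristics that seek to maximize the ratio of edges within partitions to edges between them. Node embedding approaches, on the other hand, represent each node of a graph as a real-valued vector and so transform community detection in a graph into the problem of clustering a set of vectors. Existing node embedding approaches first start random walks from each node to construct the node's context and then make the node's vector representation close to that context. However, standard node embedding approaches do not directly take the community structure of the network into account when constructing the context around each node. To address this, we explore two threads of work. First, we investigate maximum entropy-based random walks to obtain embeddings that better preserve centrality, which may lead to more effective clusters in the embedded space. Second, we propose a community structure-aware node embedding approach that incorporates modularity-based partitioning heuristics into the objective function of node embedding. We demonstrate that the proposed combination of combinatorial and embedding approaches outperforms a number of modularity-based baselines, as well as K-means clustering on a standard node-embedded (node2vec) vector space, on a wide range of real-life and synthetic networks of different sizes and densities.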
Learning through the grapevine: the impact of noise and the breadth and depth of social networks
Original title:
Learning through the Grapevine: The Impact of Noise and the Breadth and Depth of Social Networks
URL:
http://arxiv.org/abs/1812.03354
Authors:
Matthew O. Jackson,Suraj Malladi,David McAdams
Abstract:We examine how well people learn when information is noisily relayed from person to person; and we study how communication platforms can improve learning without censoring or fact-checking messages. We analyze learning as a function of social network depth (how many times information is relayed) and breadth (the number of relay chains accessed). Noise builds up as depth increases, so learning requires greater breadth. In the presence of mutations (deliberate or random) and transmission failures of messages, we characterize sharp thresholds for breadths above which receivers learn fully and below which they learn nothing. When there is uncertainty about mutation rates, optimizing learning requires either capping depth, or if that is not possible, limiting breadth by capping the number of people to whom someone can forward a message. Limiting breadth cuts the number of messages received but also decreases the fraction originating further from the receiver, and so can increase the signal to noise ratio. Finally, we extend our model to study learning from message survival: e.g., people are more likely to pass messages with one conclusion than another. We find that as depth grows, all learning comes from either the total number of messages received or from the content of received messages, but the learner does not need to pay attention to both.
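Summary:We examine how well people learn when information is relayed noisily from person to person, and we study how communication platforms can improve learning without censoring or fact-checking messages. We analyze learning as a function of social network depth (how many times information is relayed) and breadth (the number of relay chains accessed). Noise builds up as depth increases, so learning requires greater breadth. In the presence of mutations (deliberate or random) and transmission failures, we characterize sharp thresholds for breadth above which receivers learn fully and below which they learn nothing. When mutation rates are uncertain, optimizing learning requires either capping depth or, if that is not possible, limiting breadth by capping the number of people to whom someone can forward a message. Limiting breadth cuts the number of messages received but also decreases the fraction that originate far from the receiver, and so can increase the signal-to-noise ratio. Finally, we extend the model to study learning from message survival: for example, people may be more likely to pass on messages with one conclusion than with another. We find that as depth grows, all learning comes either from the total number of messages received or from the content of received messages, but the learner does not need to pay attention to both.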
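The interplay of depth and breadth described in this abstract can be made concrete with a deliberately simplified sketch. It assumes a binary message, a symmetric flip probability at every relay, independent chains, and a receiver who takes a majority vote; the paper's own model also covers transmission failures and uncertain mutation rates, which are left out here. The function names and parameter values are hypothetical.

```python
from math import comb


def chain_accuracy(mutation_rate, depth):
    """Probability that a binary message is still correct after `depth` relays.

    Each relay flips the message independently with probability `mutation_rate`,
    so the message is correct when the number of flips is even.
    """
    return 0.5 * (1 + (1 - 2 * mutation_rate) ** depth)


def majority_accuracy(mutation_rate, depth, breadth):
    """Probability that a majority vote over `breadth` independent chains is correct.

    `breadth` is assumed to be odd so that ties cannot occur.
    """
    p = chain_accuracy(mutation_rate, depth)
    return sum(
        comb(breadth, k) * p**k * (1 - p) ** (breadth - k)
        for k in range(breadth // 2 + 1, breadth + 1)
    )
```

In this toy setting, `chain_accuracy` falls toward 0.5 as depth grows, and increasing `breadth` in `majority_accuracy` compensates for that loss, which mirrors the qualitative statement that noise builds up with depth so learning requires greater breadth.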
Dominate or delete: decentralized competing bandits with uniform valuation
Original title:
Dominate or Delete: Decentralized Competing Bandits with Uniform Valuation
URL:
http://arxiv.org/abs/2006.15166
Authors:
Abishek Sankararaman,Soumya Basu,Karthik Abinav Sankararaman
Abstract:We study regret minimization problems in a two-sided matching market where uniformly valued demand side agents (a.k.a. agents) continuously compete for getting matched with supply side agents (a.k.a. arms) with unknown and heterogeneous valuations. Such markets abstract online matching platforms (e.g. UpWork, TaskRabbit) and fall within the purview of matching bandit models introduced in Liu et al. \cite{matching_bandits}. The uniform valuation in the demand side admits a unique stable matching equilibrium in the system. We design the first decentralized algorithm, \fullname (\name), for matching bandits under uniform valuation that does not require any knowledge of reward gaps or time horizon, and thus partially resolves an open question in \cite{matching_bandits}. \name works in phases of exponentially increasing length. In each phase
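Summary:We study regret minimization in a two-sided matching market in which uniformly valued demand-side agents (a.k.a. agents) continuously compete to be matched with supply-side agents (a.k.a. arms) that have unknown and heterogeneous valuations. Such markets abstract online matching platforms (e.g. UpWork, TaskRabbit) and fall within the matching bandit models introduced in Liu et al. \cite{matching_bandits}. The uniform valuation on the demand side admits a unique stable matching equilibrium in the system. We design the first decentralized algorithm, \fullname (\name), for matching bandits under uniform valuation that requires no knowledge of reward gaps or time horizon, and thus partially resolves an open question in \cite{matching_bandits}. \name works in phases of exponentially increasing length. In each phase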
Stay with your community: bridges between clusters trigger expansion of COVID-19
Original title:
Stay with Your Community: Bridges between Clusters Trigger Expansion of COVID-19
URL:
http://arxiv.org/abs/2006.16047
Authors:
Yukio Ohsawa,Masaharu Tsubokura
Abstract:The spreading of virus infection is here simulated over artificial human networks. The real-space urban life of people is modeled as a modified scale-free network with constraints. A scale-free network has been adopted in several studies for modeling on-line communities so far but is modified here for the aim to represent peoples' social behaviors where the generated communities are restricted reflecting the spatiotemporal constraints in the real life. Furthermore, the networks have been extended by introducing multiple cliques in the initial step of network construction and enabling people to zero-degree people as well as popular (large degree) people. As a result, four findings and a policy proposal have been obtained. First, the "second waves" occur without external influence or constraints on contacts or the releasing of the constraints. These second waves, mostly lower than the first wave, implies the bridges between infected and fresh clusters may trigger new expansions of spreading. Second, if the network changes the structure on the way of infection spreading or after its suppression, the peak of the second wave can be larger than the first. Third, the peak height in the time series depends on the difference between the upper bound of the number of people each member accepts to meet and the number of people one chooses to meet. This tendency is observed for two kinds of artificial networks and implies the impact of the bridges between communities on the virus spreading. Fourth, the release of once given constraint may trigger a second wave higher than the peak of the time series without introducing any constraint from the beginning, if the release is introduced at a time close to the peak. Thus, both governments and individuals should be careful in returning to human society with inter-community contacts.
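Summary:The spread of virus infection is simulated here over artificial human networks. People's real-space urban life is modeled as a modified scale-free network with constraints. Scale-free networks have been used in several studies to model online communities, but here the network is modified to represent people's social behavior, with the generated communities restricted to reflect the spatiotemporal constraints of real life. The networks are further extended by introducing multiple cliques in the initial step of network construction and by allowing people to be zero-degree as well as popular (large-degree). Four findings and a policy proposal result. First, "second waves" occur without external influence and without constraints on contacts or the release of such constraints. These second waves, mostly lower than the first wave, imply that bridges between infected and fresh clusters may trigger new expansions of spreading. Second, if the network changes its structure while the infection is spreading or after its suppression, the peak of the second wave can be larger than the first. Third, the peak height in the time series depends on the difference between the upper bound on the number of people each member accepts to meet and the number of people one chooses to meet. This tendency is observed for two kinds of artificial networks and implies the impact of bridges between communities on virus spreading. Fourth, releasing a constraint that was once imposed may trigger a second wave higher than the peak of the time series in which no constraint was ever introduced, if the release comes at a time close to the peak. Governments and individuals should therefore be careful when returning to a society with inter-community contacts.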
A privacy-preserving tests optimization algorithm for epidemics containment
Original title:
A privacy-preserving tests optimization algorithm for epidemics containment
URL:
https://arxiv.org/abs/2006.15977
Authors:
Alessandro Nuara,Francesco Trovò,Nicola Gatti
Abstract:The SARS-CoV-2 outbreak changed the everyday life of almost all people around the world. Currently, we are facing the problem of containing the spread of the virus both by using the more effective forced lockdown, which has the drawback of slowing down the economy of the countries involved, and by identifying and isolating the positive individuals, which, instead, is a hard task in general due to the lack of information. For this specific disease, the identification of the infected is particularly challenging since there exist categories, namely the asymptomatics, who are positive and potentially contagious but do not show any of the symptoms of SARS-CoV-2. Until the development and distribution of a vaccine is ready, we need to design ways of selecting those individuals who are most likely infected, given the limited number of tests available each day. In this paper, we make use of available data collected by the so-called contact tracing apps to develop an algorithm, namely PPTO, that identifies those individuals who are most likely positive and, therefore, should be tested. While in the past these analyses have been conducted by centralized algorithms, requiring that all the app users' data be gathered in a single database, our protocol is able to work at the device level, by exploiting the communication of anonymized information to other devices.
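Summary:The SARS-CoV-2 outbreak has changed the daily life of almost everyone in the world. We currently face the problem of containing the spread of the virus, both through the more effective forced lockdown, which has the drawback of slowing the economies of the countries involved, and through identifying and isolating positive individuals, which is generally a hard task because of the lack of information. For this disease, identifying the infected is particularly challenging because there are categories of people, namely the asymptomatic, who are positive and potentially contagious but show none of the symptoms of SARS-CoV-2. Until a vaccine has been developed and distributed, we need ways of selecting the individuals most likely to be infected, given the limited number of tests available each day. In this paper we use data collected by so-called contact tracing apps to develop an algorithm, PPTO, that identifies the individuals who are most likely positive and should therefore be tested. Whereas in the past such analyses were carried out by centralized algorithms that require all app users' data to be gathered in a single database, our protocol works at the device level by exchanging anonymized information with other devices.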
Effect of population migration and punctuated lockdown on the spread of SARS-CoV-2
Original title:
Effect of population migration and punctuated lockdown on the spread of SARS-CoV-2
URL:
https://arxiv.org/abs/2006.15010
Authors:
Ravi Kiran,Madhumita Roy,Syed Abbas,A. Taraphder
Abstract:Once past the lockdown stage in many parts of the world, the important question now concerns the effects of relaxing the lockdown and finding the best ways to implement further lockdown(s), if required, to control the spread. With the relaxation of lockdown, people migrate to different cities and enhance the spread of the virus. In the present work we study a modified SEIRS model with population migration between two cities: a fraction of the population in each city is allowed to migrate. Possible infection during transit is also incorporated: a fraction of the exposed population becomes infected during transit. A punctuated lockdown is implemented to simulate a protocol of repeated lockdowns that limits the resurgence of infections. A damped oscillatory behavior is observed with multiple peaks over a period.
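Summary:With the lockdown stage now past in many parts of the world, the important question concerns the effects of relaxing the lockdown and the best ways to implement further lockdowns, if required, to control the spread. As lockdown is relaxed, people migrate to different cities and enhance the spread of the virus. In this work we study a modified SEIRS model with population migration between two cities, in which a fraction of the population of each city is allowed to migrate. Possible infection during transit is also included: a fraction of the exposed population becomes infected in transit. A punctuated lockdown is implemented to simulate a protocol of repeated lockdowns that limits the resurgence of infections. Damped oscillatory behavior with multiple peaks over a period is observed.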
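The ingredients named in this abstract (SEIRS compartments, migration between two cities, infection during transit, and a punctuated lockdown) can be combined in a minimal sketch. This is not the authors' model: the equations are a generic SEIRS form, the migration and transit rules are simplifying assumptions, and every parameter value and the function name `simulate_two_city_seirs` are hypothetical.

```python
def simulate_two_city_seirs(days=300, dt=0.1, beta=0.4, sigma=0.2, gamma=0.1,
                            xi=0.01, migration=0.01, transit_infection=0.1,
                            lockdown_period=60, lockdown_length=30,
                            lockdown_factor=0.3):
    """Euler integration of a two-city SEIRS model with migration.

    All parameter values are illustrative assumptions, not fitted values.
    Returns (history, cities): history holds (time, I in city 1, I in city 2)
    tuples, and cities holds the final [S, E, I, R] state of each city.
    """
    # State per city: [S, E, I, R]; the outbreak is seeded in city 1 only.
    cities = [[0.99, 0.0, 0.01, 0.0], [1.0, 0.0, 0.0, 0.0]]
    history = []
    for step in range(round(days / dt)):
        t = step * dt
        # Punctuated lockdown (assumed schedule): the contact rate is reduced
        # during the first `lockdown_length` days of every cycle.
        in_lockdown = (t % lockdown_period) < lockdown_length
        beta_t = beta * lockdown_factor if in_lockdown else beta
        updated = []
        for s, e, i, r in cities:
            n = s + e + i + r
            new_exposed = beta_t * s * i / n
            ds = -new_exposed + xi * r       # recovered people lose immunity
            de = new_exposed - sigma * e     # exposed people become infectious
            di = sigma * e - gamma * i       # infectious people recover
            dr = gamma * i - xi * r
            updated.append([s + dt * ds, e + dt * de, i + dt * di, r + dt * dr])
        # Migration: the same fraction of every compartment leaves each city.
        moved = [[migration * dt * x for x in city] for city in updated]
        for src, dst in ((0, 1), (1, 0)):
            ms, me, mi, mr = moved[src]
            for k, amount in enumerate(moved[src]):
                updated[src][k] -= amount
            # A fraction of the migrating exposed people arrives infected.
            updated[dst][0] += ms
            updated[dst][1] += me * (1 - transit_infection)
            updated[dst][2] += mi + me * transit_infection
            updated[dst][3] += mr
        cities = updated
        history.append((t, cities[0][2], cities[1][2]))
    return history, cities
```

Plotting the two infected series from `history` is one way to inspect whether repeated lockdown cycles produce several peaks; the sketch conserves total population, so it can also serve as a quick sanity check when the parameters are changed.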
Pileup corrections on higher-order cumulants
Original title:
Pileup corrections on higher-order cumulants
URL:
http://arxiv.org/abs/2006.15809
Authors:
Toshihiro Nonaka,Masakiyo Kitazawa,ShinIchi Esumi
Abstract:We propose a method to remove the contributions of pileup events from higher-order cumulants and moments of event-by-event particle distributions. Assuming that the pileup events are given by the superposition of two independent single-collision events, we show that the true moments in each multiplicity bin can be obtained recursively from lower multiplicity events. In the correction procedure the necessary information are only the probabilities of pileup events. Other terms are extracted from the experimental data. We demonstrate that the true cumulants can be reconstructed successfully by this method in simple models. Systematics on trigger inefficiencies and correction parameters are discussed.
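Summary:We propose a method for removing the contributions of pileup events from higher-order cumulants and moments of event-by-event particle distributions. Assuming that pileup events are given by the superposition of two independent single-collision events, we show that the true moments in each multiplicity bin can be obtained recursively from lower-multiplicity events. The only information the correction procedure needs is the probabilities of pileup events; the other terms are extracted from the experimental data. We demonstrate in simple models that the true cumulants can be reconstructed successfully by this method. Systematics on trigger inefficiencies and correction parameters are discussed.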
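Global correlation matrix spectra of the surface temperature of the oceans: from random matrix theory to Poisson fluctuations
Original title:
Global correlation matrix spectra of the surface temperature of the Oceans from Random Matrix Theory to Poisson fluctuations
URL:
http://arxiv.org/abs/2006.16001
Authors: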
Eucymara F. N. Santosa,Anderson L. R. Barbosa,Paulo J. Duarte-Neto
Abstract:In this work we use the random matrix theory (RMT) to correctly describe the behavior of spectral statistical properties of the sea surface temperature of oceans. This oceanographic variable plays an important role in the global climate system. The data were obtained from National Oceanic and Atmospheric Administration (NOAA) and delimited for the period 1982 to 2016. The results show that oceanographic systems presented specific
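Summary:In this work we use random matrix theory (RMT) to correctly describe the behavior of the spectral statistical properties of the sea surface temperature of the oceans. This oceanographic variable plays an important role in the global climate system. The data were obtained from the National Oceanic and Atmospheric Administration (NOAA) and cover the period 1982 to 2016. The results show that oceanographic systems presented specific values that can be used to classify each ocean according to its correlation behavior. The nearest-neighbor spacing of the central, northern, and southern correlation matrices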
Source: 集智斑图  Editor: 王建萍